Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Shuffle(). #85129

Merged
merged 3 commits into from
Apr 22, 2023
Merged

Adding Shuffle(). #85129

merged 3 commits into from
Apr 22, 2023

Conversation

DeepakRajendrakumaran
Copy link
Contributor

This PR adds Vector512.Shuffle().

  1. Vector256.Shuffle() has a bug when index is out of range. I didn't change it for now. I worked around it for Vector512
  2. Vector512.IsHardwareAccelerated already returns 1

@dotnet/avx512-contrib

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Apr 20, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 20, 2023
@ghost
Copy link

ghost commented Apr 20, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

This PR adds Vector512.Shuffle().

  1. Vector256.Shuffle() has a bug when index is out of range. I didn't change it for now. I worked around it for Vector512
  2. Vector512.IsHardwareAccelerated already returns 1

@dotnet/avx512-contrib

Author: DeepakRajendrakumaran
Assignees: -
Labels:

area-CodeGen-coreclr, community-contribution

Milestone: -

@BruceForstall
Copy link
Member

  1. Vector256.Shuffle() has a bug when index is out of range. I didn't change it for now. I worked around it for Vector512

Can you open an issue for this?

@BruceForstall BruceForstall added the avx512 Related to the AVX-512 architecture label Apr 20, 2023
@DeepakRajendrakumaran
Copy link
Contributor Author

DeepakRajendrakumaran commented Apr 20, 2023

  1. Vector256.Shuffle() has a bug when index is out of range. I didn't change it for now. I worked around it for Vector512

Can you open an issue for this?

I could either do that or fix it. Basically, this assert doesn't make sense and the way we copy vector probably needs a slight change
https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/gentree.cpp#L23171

I believe I just need to change it to
image

I'm okay with either solution. Just wanted to make sure I did not misunderstand before changing the assert

Edit : pointing to correct line

@tannergooding
Copy link
Member

I could either do that or fix it

Just iterating what we chatted about on Teams. We'll want to fix it here, but if you could open an issue for it as well that'd be great, as I need to look at backporting the fix for .NET 7

Thankfully its a small fix (basically 1-line change + tweaking the assert slightly).

@DeepakRajendrakumaran
Copy link
Contributor Author

I could either do that or fix it

Just iterating what we chatted about on Teams. We'll want to fix it here, but if you could open an issue for it as well that'd be great, as I need to look at backporting the fix for .NET 7

Thankfully its a small fix (basically 1-line change + tweaking the assert slightly).

I'll add a fix here to this PR.

For backporting - #85132

@DeepakRajendrakumaran DeepakRajendrakumaran marked this pull request as ready for review April 21, 2023 17:08
@DeepakRajendrakumaran DeepakRajendrakumaran force-pushed the shuffle branch 2 times, most recently from a8618c2 to ac63e4a Compare April 21, 2023 17:21
@BruceForstall
Copy link
Member

Do we need new tests to go along with this?

@tannergooding It looks like we're missing tests for lots of the AVX512 intrinsics?

@tannergooding
Copy link
Member

@tannergooding It looks like we're missing tests for lots of the AVX512 intrinsics?

This is just doing light-up for Vector512<T> which has existing tests covering all exposed APIs under:

-- Noting, they are generated at build time using the tables in https://github.com/dotnet/runtime/blob/main/src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_General.cs#L2016-L2693

We also have libraries side tests for those APIs here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Runtime.Intrinsics/tests/Vectors/Vector512Tests.cs

The platform specific intrinsics have coverage here: https://github.com/dotnet/runtime/tree/main/src/tests/JIT/HardwareIntrinsics/X86_Avx512 and are generated from https://github.com/dotnet/runtime/blob/main/src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_X86.cs#L1094-L1320

I've been ensuring we have test coverage for all the newly introduced things. We do want to add a regression test for the bug fix this PR includes, however. @DeepakRajendrakumaran That should go under https://github.com/dotnet/runtime/tree/main/src/tests/JIT/Regression/JitBlue and follow the general template as the others (name it Runtime_85132, add the validation logic in question, and have it return 100 as the exit code for success).

@BruceForstall
Copy link
Member

@tannergooding
Copy link
Member

tannergooding commented Apr 22, 2023

They are mostly on the library side due to complexity in the cross platform implementation: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Runtime.Intrinsics/tests/Vectors/Vector512Tests.cs#L2263-L3370

The platform specific shuffle APIs (which use the same general instructions) are a bit less complex and are table generated:

@DeepakRajendrakumaran
Copy link
Contributor Author

@tannergooding It looks like we're missing tests for lots of the AVX512 intrinsics?

This is just doing light-up for Vector512<T> which has existing tests covering all exposed APIs under:

-- Noting, they are generated at build time using the tables in https://github.com/dotnet/runtime/blob/main/src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_General.cs#L2016-L2693

We also have libraries side tests for those APIs here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Runtime.Intrinsics/tests/Vectors/Vector512Tests.cs

The platform specific intrinsics have coverage here: https://github.com/dotnet/runtime/tree/main/src/tests/JIT/HardwareIntrinsics/X86_Avx512 and are generated from https://github.com/dotnet/runtime/blob/main/src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_X86.cs#L1094-L1320

I've been ensuring we have test coverage for all the newly introduced things. We do want to add a regression test for the bug fix this PR includes, however. @DeepakRajendrakumaran That should go under https://github.com/dotnet/runtime/tree/main/src/tests/JIT/Regression/JitBlue and follow the general template as the others (name it Runtime_85132, add the validation logic in question, and have it return 100 as the exit code for success).

Added test.

@@ -23176,7 +23176,7 @@ GenTree* Compiler::gtNewSimdShuffleNode(
#if defined(TARGET_XARCH)
uint8_t control = 0;
bool crossLane = false;
bool needsZero = varTypeIsSmallInt(simdBaseType);
bool needsZero = varTypeIsSmallInt(simdBaseType) && (simdSize != 64);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct for the PR, but just noting we'll want to update the other simdSizes to use the full-width permutations as well, long term.

@tannergooding tannergooding merged commit 39b5b8e into dotnet:main Apr 22, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI avx512 Related to the AVX-512 architecture community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants